Data Model for Document Transformation and Assembly ( Extended
Abstract
This paper presents a data model for transforming and assembling document information such as SGML or XML documents. Its biggest advantage over other data models is that it simultaneously provides (1) powerful patterns and contextual conditions, and (2) schema transformation. Patterns capture conditions on subordinates, while contextual conditions capture conditions on superiors, siblings, subordinates of siblings, and so on; both have been recognized in the document processing community as highly important mechanisms for identifying document components. Meanwhile, schema transformation has been recognized as crucial in the database community since relational databases (RDB). However, no previous data model has provided all three of patterns, contextual conditions, and schema transformation. This data model is based on forest-regular language theory. A schema is a forest automaton, and an instance is a finite set of forests (sequences of trees). Since the parse tree set of an extended context-free grammar is accepted by a forest automaton, this model is a generalization of Gonnet and Tompa's grammatical model. Patterns are captured as forest automata; contextual conditions are captured as pointed forest representations (a variation of Podelski's pointed tree representations). Controlled by patterns and contextual conditions, an operator creates an instance from an input instance and also creates a reasonably small schema from an input schema. Furthermore, the created schema is often minimally sufficient: any forest permitted by it may be generated by some input instance.
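To make the idea of "a schema is a forest automaton" concrete, the following Python sketch shows one simple way a bottom-up forest automaton can check a forest. It is an illustration under stated assumptions, not the paper's formal definition: the class names `Tree` and `ForestAutomaton`, the single-character state names, the first-match rule selection, and the use of regular expressions for the horizontal languages are all choices made here for brevity.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Tree:
    """A labelled ordered tree; text content is omitted for simplicity."""
    label: str
    children: tuple = ()


class ForestAutomaton:
    """Illustrative bottom-up forest automaton (hypothetical, simplified).

    rules: list of (label, child_pattern, state). A node with `label` whose
           children's states, concatenated, fully match `child_pattern`
           is assigned `state`. States are single characters (assumption).
    final: regular expression over the states of the forest's root nodes.
    """

    def __init__(self, rules, final):
        self.rules = [(lab, re.compile(pat), st) for lab, pat, st in rules]
        self.final = re.compile(final)

    def state_of(self, tree):
        """Return the state assigned to `tree`, or None if no rule applies."""
        child_states = []
        for child in tree.children:
            s = self.state_of(child)
            if s is None:
                return None
            child_states.append(s)
        word = "".join(child_states)
        # First matching rule wins (a determinism assumption of this sketch).
        for label, pattern, state in self.rules:
            if label == tree.label and pattern.fullmatch(word):
                return state
        return None

    def accepts(self, forest):
        """A forest is accepted if its root states match the final pattern."""
        states = [self.state_of(t) for t in forest]
        if None in states:
            return False
        return self.final.fullmatch("".join(states)) is not None


# Hypothetical schema: a forest of one or more `doc` trees, each holding
# one `title` followed by one or more `para` elements.
schema = ForestAutomaton(
    rules=[("title", "", "t"), ("para", "", "p"), ("doc", "tp+", "d")],
    final="d+",
)

good = [Tree("doc", (Tree("title"), Tree("para"), Tree("para")))]
bad = [Tree("doc", (Tree("para"), Tree("title")))]
good_accepted = schema.accepts(good)
bad_accepted = schema.accepts(bad)
```

In this sketch the rule for `doc` plays the role of a content model in an extended context-free grammar, which is why parse tree sets of such grammars fit the automaton view; a pattern could be written as a second `ForestAutomaton` over the same trees.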